Introduction to Pandas
Pandas is a Python library providing high-performance, easy-to-use data structures and data analysis tools. It offers easy and powerful ways to import data from a variety of sources and to export it to just as many. It is also explicitly designed to handle missing data gracefully, which is a very common problem in real-world data.
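As a minimal sketch of importing, exporting and missing-data handling, the example below reads a small CSV with one empty cell. The data, column names and the `csv_text` variable are made up for illustration; in practice you would pass a file path to `read_csv` and `to_csv`.

```python
import io

import pandas as pd

# Hypothetical CSV content; the temperature for York is deliberately left empty.
csv_text = "city,temperature\nLeeds,12.5\nYork,\nBath,15.0\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# The empty cell is read as NaN, the marker pandas uses for missing values.
n_missing = int(df["temperature"].isna().sum())

# Most aggregations skip missing values by default.
mean_temp = df["temperature"].mean()

# With no path given, to_csv returns the CSV as a string.
exported = df.to_csv(index=False)

print(n_missing, mean_temp)
```

Here the mean is computed over the two available temperatures only, without any extra handling of the missing one.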
It can be used for the same tasks you might otherwise do in a spreadsheet such as Excel: combining columns of data, applying functions to them and finally displaying the results as a graph.
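To make the spreadsheet comparison concrete, here is a small sketch that combines two columns into a third and then summarises it, much like a formula column and a total cell. The `sales` table and its column names are invented for this example.

```python
import pandas as pd

# Hypothetical sales table, equivalent to two spreadsheet columns.
sales = pd.DataFrame({
    "units": [3, 5, 2],
    "unit_price": [2.0, 1.5, 4.0],
})

# Combine two columns element by element, like a formula filled down a column.
sales["revenue"] = sales["units"] * sales["unit_price"]

# Apply a function to the whole column, like a SUM cell.
total_revenue = sales["revenue"].sum()

print(sales)
print(total_revenue)

# If matplotlib is installed, sales["revenue"].plot(kind="bar") would draw the chart.
```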
The official pandas documentation is very comprehensive and will answer a lot of your questions. However, it can sometimes be hard to find the right page, so don’t be afraid to use Google to find help.
Most data analyses will follow a similar series of steps which we will go through in this course:
Pandas provides us with all the tools we need to be able to select, query, filter and combine our data.
We will start with the basics of selecting our data and then move on to more complex selections and how to ask questions of our data.